“Principal component analysis (PCA), Principal component regression (PCR), and Sparse PCA in R”
SO2: SO2 content of air in micrograms per cubic metre
temp: average annual temperature in degrees Fahrenheit
manu: number of manufacturing enterprises employing 20 or more workers
popul: population size (1970 census) in thousands
wind: average annual wind speed in miles per hour
precip: average annual precipitation in inches
predays: average number of days with precipitation per year
rm(list = ls())
#packages <- c("HSAUR2", "elasticnet")
#if (length(setdiff(packages, rownames(installed.packages()))) > 0) {
# install.packages(setdiff(packages, rownames(installed.packages())), dependencies = TRUE)
#}
library(HSAUR2)
## Loading required package: tools
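The head() output and the masking messages below suggest the data were loaded and attached roughly as follows (a sketch; the exact calls are not shown in the original):

```r
data("USairpollution", package = "HSAUR2")  # air pollution in 41 US cities
attach(USairpollution)  # make SO2, temp, ... directly accessible
head(USairpollution)
```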
## SO2 temp manu popul wind precip predays
## Albany 46 47.6 44 116 8.8 33.36 135
## Albuquerque 11 56.8 46 244 8.9 7.77 58
## Atlanta 24 61.5 368 497 9.1 48.34 115
## Baltimore 47 55.0 625 905 9.6 41.31 111
## Buffalo 11 47.1 391 463 12.4 36.11 166
## Charleston 31 55.2 35 71 6.5 40.75 148
## The following object is masked from package:datasets:
##
## precip
To begin we shall ignore the SO2 variable and concentrate on the others, two of which relate to human ecology (popul, manu) and four to climate (temp, wind, precip, predays). A case can be made for using negative temperature values in subsequent analyses, since then all six variables are such that high values represent a less attractive environment. This is, of course, a personal view, but as we shall see later, the simple transformation of temp does aid interpretation.
Source: An Introduction to Applied Multivariate Analysis with R by Brian Everitt, Torsten Hothorn, p. 86
USairpollution$negtemp <- -USairpollution$temp # high values = less attractive environment
USairpollution <- USairpollution[, c(1, 8, 3:7)] # replace temp (column 2) by negtemp
attach(USairpollution)
## The following objects are masked from USairpollution (pos = 3):
##
## manu, popul, precip, predays, SO2, wind
## The following object is masked from package:datasets:
##
## precip
## SO2 negtemp manu popul wind precip predays
## Albany 46 -47.6 44 116 8.8 33.36 135
## Albuquerque 11 -56.8 46 244 8.9 7.77 58
## Atlanta 24 -61.5 368 497 9.1 48.34 115
## Baltimore 47 -55.0 625 905 9.6 41.31 111
## Buffalo 11 -47.1 391 463 12.4 36.11 166
## Charleston 31 -55.2 35 71 6.5 40.75 148
Check the sample correlation matrix and do the matrix scatterplot
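The correlation matrix shown below (and a scatterplot matrix) can be produced with, for example:

```r
round(cor(USairpollution[, -1]), 2)  # column 1 (SO2) is excluded
pairs(USairpollution[, -1], pch = 16)
```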
## negtemp manu popul wind precip predays
## negtemp 1.00 0.19 0.06 0.35 -0.39 0.43
## manu 0.19 1.00 0.96 0.24 -0.03 0.13
## popul 0.06 0.96 1.00 0.21 -0.03 0.04
## wind 0.35 0.24 0.21 1.00 -0.01 0.16
## precip -0.39 -0.03 -0.03 -0.01 1.00 0.50
## predays 0.43 0.13 0.04 0.16 0.50 1.00
QoL: Quality of Life
"The first component might be regarded as some index of 'quality of life', with high values indicating a relatively poor environment (in the authors' opinion at least).
The second component is largely concerned with a city's rainfall, having high coefficients for precip and predays, and might be labelled the 'wet weather' component.
Component three is essentially a contrast between precip and negtemp and will separate cities having high temperatures and high rainfall from those that are colder but drier. A suitable label might be simply 'climate type'."
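The summary that follows is consistent with a correlation-based PCA (the six component variances sum to 6), presumably something like:

```r
usair.pc <- princomp(USairpollution[, -1], cor = TRUE)  # PCA on the correlation matrix
summary(usair.pc, loadings = TRUE)
```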
## Importance of components:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
## Standard deviation 1.4819456 1.2247218 1.1809526 0.8719099 0.33848287
## Proportion of Variance 0.3660271 0.2499906 0.2324415 0.1267045 0.01909511
## Cumulative Proportion 0.3660271 0.6160177 0.8484592 0.9751637 0.99425879
## Comp.6
## Standard deviation 0.185599752
## Proportion of Variance 0.005741211
## Cumulative Proportion 1.000000000
##
## Loadings:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
## negtemp 0.330 0.128 0.672 0.306 0.558 0.136
## manu 0.612 -0.168 -0.273 0.137 0.102 -0.703
## popul 0.578 -0.222 -0.350 0.695
## wind 0.354 0.131 0.297 -0.869 -0.113
## precip 0.623 -0.505 -0.171 0.568
## predays 0.238 0.708 0.311 -0.580
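The city scores printed below are stored in the scores component of the princomp object:

```r
usair.pc$scores[, 1:3]  # scores on the first three components
```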
## Comp.1 Comp.2 Comp.3
## Albany -0.538945616 0.79206938 1.36260519
## Albuquerque -1.417093145 -2.86589024 1.27540034
## Atlanta -0.598994755 0.58723563 -0.99541128
## Baltimore 0.509380032 0.02875397 -0.36359953
## Buffalo 1.390708909 1.88030104 1.77619191
## Charleston -1.429753320 1.21058365 -0.07944350
## Chicago 6.513881753 -1.66838172 -2.28629587
## Cincinnati -0.508220797 0.48601020 -0.26618169
## Cleveland 1.766279326 1.03942658 0.74664124
## Columbus -0.118835030 0.64039358 0.42302559
## Dallas -0.006878799 -1.21181090 -0.99830140
## Denver -0.207434109 -1.96320936 1.26621359
## Des Moines -0.131980292 -0.06120607 1.65010856
## Detroit 2.166694857 -0.27060292 0.14704284
## Hartford -0.219106349 0.97630584 0.59461329
## Houston 0.508419801 -0.11271579 -1.99360553
## Indianapolis 0.308324007 0.36049244 0.28541539
## Jacksonville -1.227849872 0.84912801 -1.87611109
## Kansas City -0.131303464 -0.25205976 0.27549603
## Little Rock -1.611599761 0.34248684 -0.83970812
## Louisville -0.423739264 0.54055336 -0.37446946
## Memphis -0.577848813 0.32506654 -1.11488311
## Miami -1.533160553 1.40469861 -2.60660585
## Milwaukee 1.391024518 0.15761872 1.69127813
## Minneapolis 1.500032786 0.24675569 1.75081328
## Nashville -0.910052287 0.54346445 -0.85923113
## New Orleans -1.453857066 0.90075225 -1.99187430
## Norfolk -0.589141903 0.75219052 -0.06092797
## Omaha -0.133637672 -0.38478358 1.23581021
## Philadelphia 2.797074676 -0.65847826 -1.41511547
## Phoenix -2.440096802 -4.19114925 -0.94155229
## Pittsburgh 0.322264830 1.02663680 0.74808213
## Providence 0.069936477 1.03390711 0.88774002
## Richmond -1.171998946 0.33494902 -0.50862036
## Salt Lake City -0.912393969 -1.54734758 1.56510204
## San Francisco -0.502073845 -2.25528717 0.22663991
## Seattle 0.481679438 1.59742576 0.60871204
## St. Louis 0.286187330 -0.38438239 -0.15567183
## Washington -0.022928417 -0.05456742 -0.35387289
## Wichita -0.196823158 -0.67607441 1.13122963
## Wilmington -0.996140738 0.50074082 0.43332129
par(pty="s")
plot(usair.pc$scores[,1], usair.pc$scores[,2],
     ylim=range(usair.pc$scores[,1]),
     xlab="QoL", ylab="Wet weather", type="n", lwd=2)
text(usair.pc$scores[,1], usair.pc$scores[,2],
     labels=abbreviate(row.names(USairpollution)), cex=0.7, lwd=2)
par(pty="s")
plot(usair.pc$scores[,1], usair.pc$scores[,3],
     ylim=range(usair.pc$scores[,1]),
     xlab="QoL", ylab="Climate type", type="n", lwd=2)
text(usair.pc$scores[,1], usair.pc$scores[,3],
     labels=abbreviate(row.names(USairpollution)), cex=0.7, lwd=2)
par(pty="s")
plot(usair.pc$scores[,2], usair.pc$scores[,3],
     ylim=range(usair.pc$scores[,2]),
     xlab="Wet weather", ylab="Climate type", type="n", lwd=2)
text(usair.pc$scores[,2], usair.pc$scores[,3],
     labels=abbreviate(row.names(USairpollution)), cex=0.7, lwd=2)
par(mfrow=c(1,3))
plot(usair.pc$scores[,1], SO2, xlab="PC1")
plot(usair.pc$scores[,2], SO2, xlab="PC2")
plot(usair.pc$scores[,3], SO2, xlab="PC3")
##
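The regression summary below matches a linear model of SO2 on the first three component scores (principal component regression); the object name usair_lm is an assumption:

```r
usair_lm <- lm(SO2 ~ usair.pc$scores[, 1] + usair.pc$scores[, 2] +
                 usair.pc$scores[, 3])
summary(usair_lm)
```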
## Call:
## lm(formula = SO2 ~ usair.pc$scores[, 1] + usair.pc$scores[, 2] +
## usair.pc$scores[, 3])
##
## Residuals:
## Min 1Q Median 3Q Max
## -36.420 -10.981 -3.184 12.087 61.273
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 30.049 2.907 10.336 1.85e-12 ***
## usair.pc$scores[, 1] 9.942 1.962 5.068 1.14e-05 ***
## usair.pc$scores[, 2] 2.240 2.374 0.943 0.352
## usair.pc$scores[, 3] -0.375 2.462 -0.152 0.880
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.62 on 37 degrees of freedom
## Multiple R-squared: 0.4182, Adjusted R-squared: 0.371
## F-statistic: 8.866 on 3 and 37 DF, p-value: 0.0001473
Another data set: “mnist_train.txt”
We only use the handwritten digit 3.
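A sketch of the presumed loading step (the file layout, one label column V1 followed by 784 pixel columns, is an assumption based on the dimensions printed below):

```r
number <- read.table("mnist_train.txt")  # 60000 x 785
number_3 <- number[number$V1 == 3, ]     # keep only the digit 3 (6131 rows)
dim(number); dim(number_3)
```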
## [1] 60000 785
## [1] 6131 785
number_mat <- matrix(as.numeric(number_3[3,-1]), 28, 28, byrow = T)
mat1 <- apply(number_mat, 2, rev)
image(1:28, 1:28, t(mat1), axes = FALSE, xlab = "", ylab = "",
      col = grey(seq(1, 0, length = 255)))
Some data reduction:
number_ <- number_3[,which(!apply(number_3,2,FUN = function(x){all(x == 0)}))]
number_$V1 <- NULL
dim(number_)
## [1] 6131 585
cor = FALSE indicates that we use the covariance matrix here instead of the correlation matrix, because the correlation matrix can only be used if there are no constant variables, and a handwriting data set like this one is likely to contain constant (all-zero) pixel variables.
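The presumed call (the object name out_pca is taken from the code further down):

```r
out_pca <- princomp(number_, cor = FALSE)  # PCA on the covariance matrix
```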
Loadings of the first four PCs:
loadings_pca <- cbind(round(out_pca$loadings[,1], digits = 4),
round(out_pca$loadings[,2], digits = 4),
round(out_pca$loadings[,3], digits = 4),
round(out_pca$loadings[,4], digits = 4))
# proportion of explained variance
out_pca$sdev["Comp.1"]^2/sum(out_pca$sdev^2)
## Comp.1
## 0.124696
## Comp.2
## 0.09268516
## Comp.3
## 0.07955255
## Comp.4
## 0.05490474
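The four proportions printed above can also be computed in a single step:

```r
out_pca$sdev[paste0("Comp.", 1:4)]^2 / sum(out_pca$sdev^2)
```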
Extract the loadings for the retained pixels and map them back onto the full 28 x 28 grid:
used_pca <- which(names(number)[-1] %in% colnames(number_)) # pixel indices; drop the label column V1 to avoid an off-by-one shift
pc <- matrix(0, nrow = 4, ncol = 784)
v <- 1
for (i in used_pca) {
  pc[, i] <- loadings_pca[v, ]
  v <- v + 1
}
number_mat <- matrix(as.numeric(number_3[3,-1]), 28, 28, byrow = T)
mat1 <- apply(number_mat, 2, rev)
col <- colorRampPalette(c("white", "blue"))
col2 <- colorRampPalette(c("white", "red"))
for (i in 1:4) {
pca_mat <- matrix(as.numeric(pc[i,]), 28, 28, byrow = T)
mat_pca <- apply(pca_mat, 2, rev)
#blue: positive loadings, red: negative loadings)
A <- t(mat1)
B <- (A>1) * t(mat_pca)
B[ B<=0.0 ] <- NA
image(1:28, 1:28, A, axes = FALSE, xlab = "", ylab = "", col = grey(seq(1, 0, length = 255)))
image(1:28, 1:28, B, axes = FALSE, xlab = "", ylab = "",
col= col(10), add=T)
A <- t(mat1)
B <- (A>1) * t(mat_pca)
B[ B>=0.0 ] <- NA
image(1:28, 1:28, B, axes = FALSE, xlab = "", ylab = "",
col= col2(10), add=T)
B <- t(mat_pca)
B[A>1] <-NA
B[ B>=0.00 ] <- NA
image(1:28, 1:28, B, axes = FALSE, xlab = "", ylab = "",
col= col(10), add=T) # overlay
B <- t(mat_pca)
B[A>1] <-NA
B[ B<=0.00 ] <- NA
image(1:28, 1:28, B, axes = FALSE, xlab = "", ylab = "",
col= col2(10), add=T) # overlay
}
The first four principal components explain only a small part of the total variance here (about 35%). With so little explained variance we cannot see a clear structure in the positive or negative loadings of the principal components on the individual pixels. Alternative techniques, not covered in this lecture, would be needed.
Important point: we need to understand what principal component analysis does, and for what reasons we might use it.
We now consider the concept of manifest and latent variables.
#packages <- c("MASS", "psych", "corrplot", "GPArotation")
#if (length(setdiff(packages, rownames(installed.packages()))) > 0) {
# install.packages(setdiff(packages, rownames(installed.packages())), dependencies = TRUE)
#}
library(MASS)
# import data
bfi <- readRDS("bfi.rds")
head(bfi)
## A2 A3 A5 C1 C2 C3 C4 C5 N1 N2 N3
## 61617 4 3 4 2 3 3 4 4 3 4 2
## 61618 4 5 5 5 4 4 3 4 3 3 3
## 61620 4 5 4 4 5 4 2 5 4 5 4
## 61621 4 6 5 4 4 3 5 5 2 5 2
## 61622 3 3 5 4 4 5 3 2 2 3 4
## 61623 6 5 5 6 6 6 1 3 3 5 2
##
## Attaching package: 'psych'
## The following object is masked _by_ '.GlobalEnv':
##
## bfi
## corrplot 0.84 loaded
Some useful functions for factor analysis
### factanal() - stats
#?factanal
# - Estimation only ML
# - Rotation: Varimax (orthogonal) & Promax (oblique)
### fa() - psych
#?fa
# - Estimation: 'minres' - Minimum Residual
# 'ml' - ML
# 'pa' - Principal Axes
# as well as several other methods
# - Rotation: huge variety
## Factor Analysis using method = ml
## Call: fa(r = bfi, nfactors = 4, rotate = "oblimin", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML3 ML4 ML1 ML2 h2 u2 com
## A2 0.61 0.07 0.03 0.02 0.38 0.62 1.0
## A3 0.80 -0.03 -0.02 0.04 0.63 0.37 1.0
## A5 0.62 0.00 -0.01 -0.12 0.43 0.57 1.1
## C1 -0.01 0.64 0.05 -0.02 0.38 0.62 1.0
## C2 0.02 0.69 0.01 0.05 0.47 0.53 1.0
## C3 0.06 0.44 -0.14 0.01 0.29 0.71 1.3
## C4 0.01 -0.44 0.26 0.12 0.41 0.59 1.8
## C5 -0.01 -0.01 0.91 0.00 0.85 0.15 1.0
## N1 -0.01 -0.05 -0.07 0.87 0.74 0.26 1.0
## N2 -0.01 0.04 0.04 0.82 0.68 0.32 1.0
## N3 0.03 0.04 0.10 0.64 0.44 0.56 1.1
##
## ML3 ML4 ML1 ML2
## SS loadings 1.42 1.36 1.03 1.89
## Proportion Var 0.13 0.12 0.09 0.17
## Cumulative Var 0.13 0.25 0.35 0.52
## Proportion Explained 0.25 0.24 0.18 0.33
## Cumulative Proportion 0.25 0.49 0.67 1.00
##
## With factor correlations of
## ML3 ML4 ML1 ML2
## ML3 1.00 0.25 -0.21 -0.16
## ML4 0.25 1.00 -0.49 -0.09
## ML1 -0.21 -0.49 1.00 0.31
## ML2 -0.16 -0.09 0.31 1.00
##
## Mean item complexity = 1.1
## Test of the hypothesis that 4 factors are sufficient.
##
## The degrees of freedom for the null model are 55 and the objective function was 2.91 with Chi Square of 8124.83
## The degrees of freedom for the model are 17 and the objective function was 0.03
##
## The root mean square of the residuals (RMSR) is 0.01
## The df corrected root mean square of the residuals is 0.02
##
## The harmonic number of observations is 2761 with the empirical chi square 43.29 with prob < 0.00043
## The total number of observations was 2800 with Likelihood Chi Square = 75.18 with prob < 2.7e-09
##
## Tucker Lewis Index of factoring reliability = 0.977
## RMSEA index = 0.035 and the 90 % confidence intervals are 0.027 0.043
## BIC = -59.75
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## ML3 ML4 ML1 ML2
## Correlation of (regression) scores with factors 0.87 0.85 0.93 0.92
## Multiple R square of scores with factors 0.75 0.72 0.86 0.86
## Minimum correlation of possible factor scores 0.51 0.43 0.72 0.71
### Interpretation:
# F1: N1/N2/N3: 'emotional fluctuation'
# F2: A2/A3/A5: 'empathy'
# F3: C1/C2/C3: 'perfectionism'
# -> C3 and C4 have small loadings and are not included here
# F4: C5: 'wasting my time', not interpretable
#
# -> Conclusion: subsetting the C-items does not seem substantively reasonable
### (i) Decision on the number of factors
fa.parallel(bfi)
## Parallel analysis suggests that the number of factors = 3 and the number of components = 3
# -> Compare versions with 2, 3 and 4 factors
f2 <- fa(bfi, nfactors = 2, rotate = "oblimin", fm = "ml")
f3 <- fa(bfi, nfactors = 3, rotate = "oblimin", fm = "ml")
f4 <- fa(bfi, nfactors = 4, rotate = "oblimin", fm = "ml")
fa.diagram(f2)
### (ii) Rotation techniques
# orthogonal
f_varimax <- fa(bfi, nfactors = 3, rotate = "varimax", fm = "ml")
f_quartimax <- fa(bfi, nfactors = 3, rotate = "quartimax", fm = "ml")
f_equamax <- fa(bfi, nfactors = 3, rotate = "equamax", fm = "ml")
# oblique
f_oblimin <- fa(bfi, nfactors = 3, rotate = "oblimin", fm = "ml")
f_promax <- fa(bfi, nfactors = 3, rotate = "promax", fm = "ml")
fa.diagram(f_varimax, main = "Varimax")
### Conclusion:
# 1) All orthogonal techniques seem to yield identical results, and the same holds
#    for the oblique ones. This is true quite generally: absent specific constraints,
#    perform just one orthogonal and one oblique FA and compare the results.
# 2) Orthogonal and oblique rotation also yield nearly identical results here.
#    Reason: there is almost no correlation among the factors, as the plots for
#    oblimin and promax show. The two would differ if the factors were correlated.
# -> Conclusion: an orthogonal rotation would be fine here.
# -> We keep using oblimin
Reliability was first introduced into psychometrics by Spearman in 1904 and refers to the consistency or dependability of test results. Depending on the focus, reliability can be divided into internal and external reliability.
Internal reliability concerns whether a set of questionnaire items measures the same concept, i.e. how internally consistent the items are. The most common internal reliability measures are Cronbach's alpha and split-half reliability. The most common external reliability measure is test-retest reliability: the same questionnaire is administered to the same subjects at different points in time, and the agreement between the measurements is computed.
In practice, many constructs and attitudes cannot be measured directly, so researchers design a set of items to reflect them indirectly. Whether these items achieve the research goal has to be assessed statistically. In the study cited below, for example, the researchers designed six items to indirectly measure employees' work enthusiasm and wanted to judge their consistency. For this situation we can use Cronbach's α.
Explanation: Cronbach's α mainly assesses the consistency of continuous and ordered categorical variables, which fits the data of this study.
Source: https://www.sohu.com/a/204514060_489312
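The two reliability analyses below were presumably produced with psych::alpha(); as a self-contained illustration of what Cronbach's α measures, it can also be computed directly from its definition (cronbach_alpha is a hypothetical helper, not a psych function):

```r
# presumed calls matching the Call: lines in the output below
# library(psych)
# alpha(bfi[, c("A2", "A3", "A5")])
# alpha(bfi[, c("N1", "N2", "N3")])

# alpha = k/(k-1) * (1 - sum of item variances / variance of the sum score)
cronbach_alpha <- function(items) {
  k <- ncol(items)
  k / (k - 1) * (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
}

# sanity check: three identical items are perfectly consistent
x <- rnorm(100)
cronbach_alpha(cbind(x, x, x))  # 1
```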
##
## Reliability analysis
## Call: alpha(x = bfi[, c("A2", "A3", "A5")])
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.72 0.72 0.64 0.46 2.6 0.0091 4.7 1 0.49
##
## lower alpha upper 95% confidence boundaries
## 0.7 0.72 0.74
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## A2 0.67 0.67 0.51 0.51 2.0 0.012 NA 0.51
## A3 0.56 0.56 0.39 0.39 1.3 0.017 NA 0.39
## A5 0.65 0.65 0.49 0.49 1.9 0.013 NA 0.49
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## A2 2773 0.77 0.78 0.60 0.51 4.8 1.2
## A3 2774 0.84 0.83 0.70 0.59 4.6 1.3
## A5 2784 0.79 0.79 0.62 0.52 4.6 1.3
##
## Non missing response frequency for each item
## 1 2 3 4 5 6 miss
## A2 0.02 0.05 0.05 0.20 0.37 0.31 0.01
## A3 0.03 0.06 0.07 0.20 0.36 0.27 0.01
## A5 0.02 0.07 0.09 0.22 0.35 0.25 0.01
##
## Reliability analysis
## Call: alpha(x = bfi[, c("N1", "N2", "N3")])
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.82 0.82 0.76 0.6 4.6 0.006 3.2 1.3 0.56
##
## lower alpha upper 95% confidence boundaries
## 0.81 0.82 0.83
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## N1 0.71 0.71 0.55 0.55 2.4 0.0110 NA 0.55
## N2 0.71 0.71 0.56 0.56 2.5 0.0108 NA 0.56
## N3 0.83 0.83 0.71 0.71 4.8 0.0065 NA 0.71
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## N1 2778 0.88 0.88 0.80 0.72 2.9 1.6
## N2 2779 0.87 0.88 0.80 0.71 3.5 1.5
## N3 2789 0.82 0.82 0.65 0.60 3.2 1.6
##
## Non missing response frequency for each item
## 1 2 3 4 5 6 miss
## N1 0.24 0.24 0.15 0.19 0.12 0.07 0.01
## N2 0.12 0.19 0.15 0.26 0.18 0.10 0.01
## N3 0.18 0.23 0.13 0.21 0.16 0.09 0.00
# -> fine (alpha = 0.82)
### C-Items
# Important: we observe negative loadings for C4 and C5 -> these items need to be reverse-coded first
bfi_new <- bfi
bfi_new$C4_rev <- 7 - bfi_new$C4 # reverse-code on the 1-6 scale
bfi_new$C5_rev <- 7 - bfi_new$C5
alpha(bfi_new[,c("C1","C2","C3","C4_rev","C5_rev")])
##
## Reliability analysis
## Call: alpha(x = bfi_new[, c("C1", "C2", "C3", "C4_rev", "C5_rev")])
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.73 0.73 0.69 0.35 2.7 0.0081 4.3 0.95 0.34
##
## lower alpha upper 95% confidence boundaries
## 0.71 0.73 0.74
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## C1 0.69 0.70 0.64 0.36 2.3 0.0093 0.0037 0.35
## C2 0.67 0.67 0.62 0.34 2.1 0.0099 0.0056 0.34
## C3 0.69 0.69 0.64 0.36 2.3 0.0096 0.0070 0.36
## C4_rev 0.65 0.66 0.60 0.33 2.0 0.0107 0.0037 0.32
## C5_rev 0.69 0.69 0.63 0.36 2.2 0.0096 0.0017 0.35
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## C1 2779 0.65 0.67 0.54 0.45 4.5 1.2
## C2 2776 0.70 0.71 0.60 0.50 4.4 1.3
## C3 2780 0.66 0.67 0.54 0.46 4.3 1.3
## C4_rev 2774 0.74 0.73 0.64 0.55 4.4 1.4
## C5_rev 2784 0.72 0.68 0.57 0.48 3.7 1.6
##
## Non missing response frequency for each item
## 1 2 3 4 5 6 miss
## C1 0.03 0.06 0.10 0.24 0.37 0.21 0.01
## C2 0.03 0.09 0.11 0.23 0.35 0.20 0.01
## C3 0.03 0.09 0.11 0.27 0.34 0.17 0.01
## C4_rev 0.02 0.08 0.16 0.17 0.29 0.28 0.01
## C5_rev 0.10 0.17 0.22 0.12 0.20 0.18 0.01
rm(list = ls())
library(sem)
library(semPlot)
library(corrplot)
library(DiagrammeR)
# Duncan, Haller, and Portes's nonrecursive peer-influences model
R.DHP <- readMoments(diag=FALSE, names=c('ROccAsp', 'REdAsp', 'FOccAsp',
'FEdAsp', 'RParAsp', 'RIQ', 'RSES',
'FSES', 'FIQ', 'FParAsp'), text = "
.6247
.3269 .3669
.4216 .3275 .6404
.2137 .2742 .1124 .0839
.4105 .4043 .2903 .2598 .1839
.3240 .4047 .3054 .2786 .0489 .2220
.2930 .2407 .4105 .3607 .0186 .1861 .2707
.2995 .2863 .5191 .5007 .0782 .3355 .2302 .2950
.0760 .0702 .2784 .1988 .1147 .1021 .0931 -.0438 .2087
")
R.DHP
## ROccAsp REdAsp FOccAsp FEdAsp RParAsp RIQ RSES FSES FIQ
## ROccAsp 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
## REdAsp 0.6247 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
## FOccAsp 0.3269 0.3669 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
## FEdAsp 0.4216 0.3275 0.6404 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000
## RParAsp 0.2137 0.2742 0.1124 0.0839 1.0000 0.0000 0.0000 0.0000 0.0000
## RIQ 0.4105 0.4043 0.2903 0.2598 0.1839 1.0000 0.0000 0.0000 0.0000
## RSES 0.3240 0.4047 0.3054 0.2786 0.0489 0.2220 1.0000 0.0000 0.0000
## FSES 0.2930 0.2407 0.4105 0.3607 0.0186 0.1861 0.2707 1.0000 0.0000
## FIQ 0.2995 0.2863 0.5191 0.5007 0.0782 0.3355 0.2302 0.2950 1.0000
## FParAsp 0.0760 0.0702 0.2784 0.1988 0.1147 0.1021 0.0931 -0.0438 0.2087
## FParAsp
## ROccAsp 0
## REdAsp 0
## FOccAsp 0
## FEdAsp 0
## RParAsp 0
## RIQ 0
## RSES 0
## FSES 0
## FIQ 0
## FParAsp 1
Show the correlation matrix in a plot:
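One way to plot it (a sketch: readMoments() fills only the lower triangle, so mirror the matrix first; corrplot was loaded above):

```r
R.full <- R.DHP + t(R.DHP) - diag(nrow(R.DHP))  # mirror the lower triangle
corrplot(R.full, method = "circle")
```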
model.DHP.2 <- specifyEquations(covs="RGenAsp, FGenAsp", text = "
RGenAsp = gam11*RParAsp + gam12*RIQ + gam13*RSES + gam14*FSES + beta12*FGenAsp
FGenAsp = gam23*RSES + gam24*FSES + gam25*FIQ + gam26*FParAsp + beta21*RGenAsp
ROccAsp = 1*RGenAsp
REdAsp = lam21(1)*RGenAsp
FOccAsp = 1*FGenAsp
FEdAsp = lam42(1)*FGenAsp
")
## NOTE: adding 4 variances to the model
sem.DHP.2 <- sem(model.DHP.2, S=R.DHP, N=329,
fixed.x=c('RParAsp', 'RIQ', 'RSES', 'FSES', 'FIQ', 'FParAsp'))
summary(sem.DHP.2)
##
## Model Chisquare = 26.69722 Df = 15 Pr(>Chisq) = 0.03130238
## AIC = 64.69722
## BIC = -60.24365
##
## Normalized Residuals
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.79953 -0.11783 0.00000 -0.01201 0.03974 1.56525
##
## R-square for Endogenous Variables
## RGenAsp FGenAsp ROccAsp REdAsp FOccAsp FEdAsp
## 0.5220 0.6170 0.5879 0.6639 0.6888 0.5954
##
## Parameter Estimates
## Estimate Std Error z value Pr(>|z|)
## gam11 0.16122238 0.03879229 4.1560415 3.238091e-05
## gam12 0.24964951 0.04398093 5.6763131 1.376288e-08
## gam13 0.21840339 0.04419737 4.9415476 7.750487e-07
## gam14 0.07183929 0.04970696 1.4452563 1.483859e-01
## beta12 0.18423232 0.09488787 1.9415793 5.218805e-02
## gam23 0.06188691 0.05171968 1.1965835 2.314690e-01
## gam24 0.22886711 0.04416218 5.1824232 2.190216e-07
## gam25 0.34903540 0.04528979 7.7067130 1.290997e-14
## gam26 0.15953413 0.03882593 4.1089578 3.974486e-05
## beta21 0.23547779 0.11938929 1.9723527 4.856936e-02
## lam21 1.06267767 0.09013865 11.7893677 4.428532e-32
## lam42 0.92972555 0.07028108 13.2286752 5.993451e-40
## V[RGenAsp] 0.28098694 0.04623153 6.0778199 1.218274e-09
## C[RGenAsp,FGenAsp] -0.02260935 0.05119391 -0.4416413 6.587488e-01
## V[FGenAsp] 0.26383537 0.04466688 5.9067334 3.489577e-09
## V[ROccAsp] 0.41214523 0.05122464 8.0458399 8.565593e-16
## V[REdAsp] 0.33614544 0.05209991 6.4519386 1.104283e-10
## V[FOccAsp] 0.31119476 0.04592712 6.7758385 1.236868e-11
## V[FEdAsp] 0.40460387 0.04618438 8.7606206 1.941783e-18
##
## gam11 RGenAsp <--- RParAsp
## gam12 RGenAsp <--- RIQ
## gam13 RGenAsp <--- RSES
## gam14 RGenAsp <--- FSES
## beta12 RGenAsp <--- FGenAsp
## gam23 FGenAsp <--- RSES
## gam24 FGenAsp <--- FSES
## gam25 FGenAsp <--- FIQ
## gam26 FGenAsp <--- FParAsp
## beta21 FGenAsp <--- RGenAsp
## lam21 REdAsp <--- RGenAsp
## lam42 FEdAsp <--- FGenAsp
## V[RGenAsp] RGenAsp <--> RGenAsp
## C[RGenAsp,FGenAsp] FGenAsp <--> RGenAsp
## V[FGenAsp] FGenAsp <--> FGenAsp
## V[ROccAsp] ROccAsp <--> ROccAsp
## V[REdAsp] REdAsp <--> REdAsp
## V[FOccAsp] FOccAsp <--> FOccAsp
## V[FEdAsp] FEdAsp <--> FEdAsp
##
## Iterations = 32